Untitled.ipynb
[1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Suppress library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')
data = pd.read_csv("used_cars_data.csv")
[3]:
data.head()
[3]:
| | S.No. | Name | Location | Year | Kilometers_Driven | Fuel_Type | Transmission | Owner_Type | Mileage | Engine | Power | Seats | New_Price | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Maruti Wagon R LXI CNG | Mumbai | 2010 | 72000 | CNG | Manual | First | 26.6 km/kg | 998 CC | 58.16 bhp | 5.0 | NaN | 1.75 |
| 1 | 1 | Hyundai Creta 1.6 CRDi SX Option | Pune | 2015 | 41000 | Diesel | Manual | First | 19.67 kmpl | 1582 CC | 126.2 bhp | 5.0 | NaN | 12.50 |
| 2 | 2 | Honda Jazz V | Chennai | 2011 | 46000 | Petrol | Manual | First | 18.2 kmpl | 1199 CC | 88.7 bhp | 5.0 | 8.61 Lakh | 4.50 |
| 3 | 3 | Maruti Ertiga VDI | Chennai | 2012 | 87000 | Diesel | Manual | First | 20.77 kmpl | 1248 CC | 88.76 bhp | 7.0 | NaN | 6.00 |
| 4 | 4 | Audi A4 New 2.0 TDI Multitronic | Coimbatore | 2013 | 40670 | Diesel | Automatic | Second | 15.2 kmpl | 1968 CC | 140.8 bhp | 5.0 | NaN | 17.74 |
[5]:
data.tail()
[5]:
| | S.No. | Name | Location | Year | Kilometers_Driven | Fuel_Type | Transmission | Owner_Type | Mileage | Engine | Power | Seats | New_Price | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7248 | 7248 | Volkswagen Vento Diesel Trendline | Hyderabad | 2011 | 89411 | Diesel | Manual | First | 20.54 kmpl | 1598 CC | 103.6 bhp | 5.0 | NaN | NaN |
| 7249 | 7249 | Volkswagen Polo GT TSI | Mumbai | 2015 | 59000 | Petrol | Automatic | First | 17.21 kmpl | 1197 CC | 103.6 bhp | 5.0 | NaN | NaN |
| 7250 | 7250 | Nissan Micra Diesel XV | Kolkata | 2012 | 28000 | Diesel | Manual | First | 23.08 kmpl | 1461 CC | 63.1 bhp | 5.0 | NaN | NaN |
| 7251 | 7251 | Volkswagen Polo GT TSI | Pune | 2013 | 52262 | Petrol | Automatic | Third | 17.2 kmpl | 1197 CC | 103.6 bhp | 5.0 | NaN | NaN |
| 7252 | 7252 | Mercedes-Benz E-Class 2009-2013 E 220 CDI Avan... | Kochi | 2014 | 72443 | Diesel | Automatic | First | 10.0 kmpl | 2148 CC | 170 bhp | 5.0 | NaN | NaN |
[7]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7253 entries, 0 to 7252
Data columns (total 14 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   S.No.              7253 non-null   int64
 1   Name               7253 non-null   object
 2   Location           7253 non-null   object
 3   Year               7253 non-null   int64
 4   Kilometers_Driven  7253 non-null   int64
 5   Fuel_Type          7253 non-null   object
 6   Transmission       7253 non-null   object
 7   Owner_Type         7253 non-null   object
 8   Mileage            7251 non-null   object
 9   Engine             7207 non-null   object
 10  Power              7207 non-null   object
 11  Seats              7200 non-null   float64
 12  New_Price          1006 non-null   object
 13  Price              6019 non-null   float64
dtypes: float64(2), int64(3), object(9)
memory usage: 793.4+ KB
[9]:
S.No.                7253
Name                 2041
Location               11
Year                   23
Kilometers_Driven    3660
Fuel_Type               5
Transmission            2
Owner_Type              4
Mileage               450
Engine                150
Power                 386
Seats                   9
New_Price             625
Price                1373
dtype: int64
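Cell [9] survives only as output. Per-column counts of distinct values like these are what `DataFrame.nunique()` returns, so the cell was probably a call of that kind. A minimal sketch on a small stand-in frame (the `sample` frame is an assumption built from rows shown by `data.head()`, not the full dataset):

```python
import pandas as pd

# Stand-in for the used-car data; values copied from the data.head() output.
sample = pd.DataFrame({
    "Name": ["Maruti Wagon R LXI CNG", "Hyundai Creta 1.6 CRDi SX Option", "Honda Jazz V"],
    "Location": ["Mumbai", "Pune", "Chennai"],
    "Fuel_Type": ["CNG", "Diesel", "Petrol"],
    "Transmission": ["Manual", "Manual", "Manual"],
})

# Number of distinct non-null values in each column.
unique_counts = sample.nunique()
print(unique_counts)
```

A column whose distinct count equals the row count (here `S.No.` with 7253) behaves like an identifier and carries no predictive information.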
[11]:
S.No.                   0
Name                    0
Location                0
Year                    0
Kilometers_Driven       0
Fuel_Type               0
Transmission            0
Owner_Type              0
Mileage                 2
Engine                 46
Power                  46
Seats                  53
New_Price            6247
Price                1234
dtype: int64
[13]:
S.No.                 0.000000
Name                  0.000000
Location              0.000000
Year                  0.000000
Kilometers_Driven     0.000000
Fuel_Type             0.000000
Transmission          0.000000
Owner_Type            0.000000
Mileage               0.027575
Engine                0.634220
Power                 0.634220
Seats                 0.730732
New_Price            86.129877
Price                17.013650
dtype: float64
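The outputs of cells [11] and [13] are consistent with a missing-value count followed by the same count expressed as a percentage of the 7253 rows (for example, 6247 / 7253 × 100 ≈ 86.13 for `New_Price`). The cell code is not shown, so the following is a sketch under that assumption, using a small stand-in frame:

```python
import numpy as np
import pandas as pd

# Stand-in frame with a few gaps; not the real dataset.
sample = pd.DataFrame({
    "Seats": [5.0, np.nan, 7.0, 5.0],
    "New_Price": [np.nan, np.nan, "8.61 Lakh", np.nan],
    "Price": [1.75, 12.50, 4.50, np.nan],
})

# Missing values per column, as a count and as a percentage of all rows.
missing_count = sample.isnull().sum()
missing_pct = missing_count / len(sample) * 100

print(missing_count)
print(missing_pct)
```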
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7253 entries, 0 to 7252
Data columns (total 13 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   Name               7253 non-null   object
 1   Location           7253 non-null   object
 2   Year               7253 non-null   int64
 3   Kilometers_Driven  7253 non-null   int64
 4   Fuel_Type          7253 non-null   object
 5   Transmission       7253 non-null   object
 6   Owner_Type         7253 non-null   object
 7   Mileage            7251 non-null   object
 8   Engine             7207 non-null   object
 9   Power              7207 non-null   object
 10  Seats              7200 non-null   float64
 11  New_Price          1006 non-null   object
 12  Price              6019 non-null   float64
dtypes: float64(2), int64(2), object(9)
memory usage: 736.8+ KB
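This second `info()` listing has 13 columns and no `S.No.`, so the serial-number column was evidently dropped in a cell that is not shown. One way to do that, sketched on a stand-in frame (the exact call used in the notebook is an assumption):

```python
import pandas as pd

# Stand-in frame; values copied from the data.head() output.
sample = pd.DataFrame({
    "S.No.": [0, 1, 2],
    "Name": ["Maruti Wagon R LXI CNG", "Hyundai Creta 1.6 CRDi SX Option", "Honda Jazz V"],
    "Year": [2010, 2015, 2011],
})

# S.No. only duplicates the row index, so it is removed.
sample = sample.drop(columns=["S.No."])
print(sample.columns.tolist())
```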
[17]:
| | Name | Location | Year | Kilometers_Driven | Fuel_Type | Transmission | Owner_Type | Mileage | Engine | Power | Seats | New_Price | Price | Car_Age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Maruti Wagon R LXI CNG | Mumbai | 2010 | 72000 | CNG | Manual | First | 26.6 km/kg | 998 CC | 58.16 bhp | 5.0 | NaN | 1.75 | 15 |
| 1 | Hyundai Creta 1.6 CRDi SX Option | Pune | 2015 | 41000 | Diesel | Manual | First | 19.67 kmpl | 1582 CC | 126.2 bhp | 5.0 | NaN | 12.50 | 10 |
| 2 | Honda Jazz V | Chennai | 2011 | 46000 | Petrol | Manual | First | 18.2 kmpl | 1199 CC | 88.7 bhp | 5.0 | 8.61 Lakh | 4.50 | 14 |
| 3 | Maruti Ertiga VDI | Chennai | 2012 | 87000 | Diesel | Manual | First | 20.77 kmpl | 1248 CC | 88.76 bhp | 7.0 | NaN | 6.00 | 13 |
| 4 | Audi A4 New 2.0 TDI Multitronic | Coimbatore | 2013 | 40670 | Diesel | Automatic | Second | 15.2 kmpl | 1968 CC | 140.8 bhp | 5.0 | NaN | 17.74 | 12 |
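The new `Car_Age` column equals a reference year minus `Year`: a 2010 car shows an age of 15, which implies the reference year was 2025. The cell itself is missing, so this sketch hard-codes that year as an assumption; the notebook may well have used the current date instead (for instance `datetime.date.today().year`).

```python
import pandas as pd

# Stand-in frame; Year values copied from the data.head() output.
sample = pd.DataFrame({"Year": [2010, 2015, 2011, 2012, 2013]})

# Assumed reference year, inferred from the displayed ages (2010 -> 15).
reference_year = 2025
sample["Car_Age"] = reference_year - sample["Year"]
print(sample)
```

Fixing the reference year keeps the feature reproducible; deriving it from today's date changes every age when the notebook is rerun in a later year.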
[23]:
| | Name | Brand | Model |
|---|---|---|---|
| 0 | Maruti Wagon R LXI CNG | Maruti | WagonR |
| 1 | Hyundai Creta 1.6 CRDi SX Option | Hyundai | Creta1.6 |
| 2 | Honda Jazz V | Honda | JazzV |
| 3 | Maruti Ertiga VDI | Maruti | ErtigaVDI |
| 4 | Audi A4 New 2.0 TDI Multitronic | Audi | A4New |
| ... | ... | ... | ... |
| 7248 | Volkswagen Vento Diesel Trendline | Volkswagen | VentoDiesel |
| 7249 | Volkswagen Polo GT TSI | Volkswagen | PoloGT |
| 7250 | Nissan Micra Diesel XV | Nissan | MicraDiesel |
| 7251 | Volkswagen Polo GT TSI | Volkswagen | PoloGT |
| 7252 | Mercedes-Benz E-Class 2009-2013 E 220 CDI Avan... | Mercedes-Benz | E-Class2009-2013 |
7253 rows × 3 columns
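In the table above, `Brand` is the first word of `Name` and `Model` is the second and third words joined without a space ("Wagon" + "R" gives "WagonR"). The cell code is not shown, so the following is only a sketch of a split that reproduces those values. The last name in the stand-in frame is hypothetical and is there to show why `Model` ends up with one missing value later on: a two-word name has no third word.

```python
import pandas as pd

names = pd.DataFrame({"Name": [
    "Maruti Wagon R LXI CNG",
    "Hyundai Creta 1.6 CRDi SX Option",
    "Honda Jazz V",
    "ExampleBrand OnlyTwoWords",  # hypothetical two-word name, not from the dataset
]})

# Split each name on whitespace into a list of words.
parts = names["Name"].str.split()

# First word is the brand; second and third words are concatenated for the model.
names["Brand"] = parts.str[0]
names["Model"] = parts.str[1] + parts.str[2]  # NaN when there is no third word
print(names)
```

Taking only the first word is also why multi-word makes appear truncated in the brand list ('Land' for Land Rover).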
['Maruti' 'Hyundai' 'Honda' 'Audi' 'Nissan' 'Toyota' 'Volkswagen' 'Tata' 'Land' 'Mitsubishi' 'Renault' 'Mercedes-Benz' 'BMW' 'Mahindra' 'Ford' 'Porsche' 'Datsun' 'Jaguar' 'Volvo' 'Chevrolet' 'Skoda' 'Mini' 'Fiat' 'Jeep' 'Smart' 'Ambassador' 'Isuzu' 'ISUZU' 'Force' 'Bentley' 'Lamborghini' 'Hindustan' 'OpelCorsa']
33
[27]:
| | Name | Location | Year | Kilometers_Driven | Fuel_Type | Transmission | Owner_Type | Mileage | Engine | Power | Seats | New_Price | Price | Car_Age | Brand | Model |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | Land Rover Range Rover 2.2L Pure | Delhi | 2014 | 72000 | Diesel | Automatic | First | 12.7 kmpl | 2179 CC | 187.7 bhp | 5.0 | NaN | 27.00 | 11 | Land | RoverRange |
| 14 | Land Rover Freelander 2 TD4 SE | Pune | 2012 | 85000 | Diesel | Automatic | Second | 0.0 kmpl | 2179 CC | 115 bhp | 5.0 | NaN | 17.50 | 13 | Land | RoverFreelander |
| 176 | Mini Countryman Cooper D | Jaipur | 2017 | 8525 | Diesel | Automatic | Second | 16.6 kmpl | 1998 CC | 112 bhp | 5.0 | NaN | 23.00 | 8 | Mini | CountrymanCooper |
| 191 | Land Rover Range Rover 2.2L Dynamic | Coimbatore | 2018 | 36091 | Diesel | Automatic | First | 12.7 kmpl | 2179 CC | 187.7 bhp | 5.0 | NaN | 55.76 | 7 | Land | RoverRange |
| 228 | Mini Cooper Convertible S | Kochi | 2017 | 26327 | Petrol | Automatic | First | 16.82 kmpl | 1998 CC | 189.08 bhp | 4.0 | 44.28 Lakh | 35.67 | 8 | Mini | CooperConvertible |
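The brand list contains both 'Isuzu' and 'ISUZU', and the rows above show 'Land' and 'Mini' standing in for longer make names. The later summary reports 32 unique brands instead of 33, which fits a clean-up step that merges the two Isuzu spellings. That step is not shown; the sketch below is one plausible version, and the replacement labels "Land Rover" and "Mini Cooper" are assumptions.

```python
import pandas as pd

# A few brand labels taken from the unique-brand output.
brands = pd.Series(["Isuzu", "ISUZU", "Land", "Mini", "Maruti"], name="Brand")

# Map inconsistent or truncated labels to a single spelling (labels are assumed).
brand_fixes = {"ISUZU": "Isuzu", "Land": "Land Rover", "Mini": "Mini Cooper"}
cleaned = brands.replace(brand_fixes)

print(cleaned.unique())
print(cleaned.nunique())
```

Only the Isuzu merge reduces the number of distinct brands; renaming 'Land' and 'Mini' changes labels without changing the count.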
[31]:
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Year | 7253.0 | 2013.365366 | 3.254421 | 1996.00 | 2011.0 | 2014.00 | 2016.00 | 2019.0 |
| Kilometers_Driven | 7253.0 | 58699.063146 | 84427.720583 | 171.00 | 34000.0 | 53416.00 | 73000.00 | 6500000.0 |
| Seats | 7200.0 | 5.279722 | 0.811660 | 0.00 | 5.0 | 5.00 | 5.00 | 10.0 |
| Price | 6019.0 | 9.479468 | 11.187917 | 0.44 | 3.5 | 5.64 | 9.95 | 160.0 |
| Car_Age | 7253.0 | 11.634634 | 3.254421 | 6.00 | 9.0 | 11.00 | 14.00 | 29.0 |
[33]:
| | count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Name | 7253 | 2041 | Mahindra XUV500 W8 2WD | 55 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Location | 7253 | 11 | Mumbai | 949 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Year | 7253.0 | NaN | NaN | NaN | 2013.365366 | 3.254421 | 1996.0 | 2011.0 | 2014.0 | 2016.0 | 2019.0 |
| Kilometers_Driven | 7253.0 | NaN | NaN | NaN | 58699.063146 | 84427.720583 | 171.0 | 34000.0 | 53416.0 | 73000.0 | 6500000.0 |
| Fuel_Type | 7253 | 5 | Diesel | 3852 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Transmission | 7253 | 2 | Manual | 5204 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Owner_Type | 7253 | 4 | First | 5952 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Mileage | 7251 | 450 | 17.0 kmpl | 207 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Engine | 7207 | 150 | 1197 CC | 732 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Power | 7207 | 386 | 74 bhp | 280 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Seats | 7200.0 | NaN | NaN | NaN | 5.279722 | 0.81166 | 0.0 | 5.0 | 5.0 | 5.0 | 10.0 |
| New_Price | 1006 | 625 | 63.71 Lakh | 6 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Price | 6019.0 | NaN | NaN | NaN | 9.479468 | 11.187917 | 0.44 | 3.5 | 5.64 | 9.95 | 160.0 |
| Car_Age | 7253.0 | NaN | NaN | NaN | 11.634634 | 3.254421 | 6.0 | 9.0 | 11.0 | 14.0 | 29.0 |
| Brand | 7253 | 32 | Maruti | 1444 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Model | 7252 | 726 | SwiftDzire | 189 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
Categorical Variables:
Index(['Name', 'Location', 'Fuel_Type', 'Transmission', 'Owner_Type',
'Mileage', 'Engine', 'Power', 'New_Price', 'Brand', 'Model'],
dtype='object')
Numerical Variables:
['Year', 'Kilometers_Driven', 'Seats', 'Price', 'Car_Age']
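The two lists above separate text columns from numeric ones. Note that `Mileage`, `Engine`, `Power` and `New_Price` land in the categorical list only because they still carry unit suffixes such as "kmpl" or "CC". A sketch of one way to build such lists with `select_dtypes` follows; the notebook's own cell is not shown, and selecting by "not numeric" here is a choice made for the sketch.

```python
import pandas as pd

# Stand-in frame; values copied from the data.head() output.
sample = pd.DataFrame({
    "Name": ["Maruti Wagon R LXI CNG", "Honda Jazz V"],
    "Year": [2010, 2011],
    "Fuel_Type": ["CNG", "Petrol"],
    "Price": [1.75, 4.50],
})

# Non-numeric columns are treated as categorical, the rest as numerical.
cat_cols = sample.select_dtypes(exclude="number").columns.tolist()
num_cols = sample.select_dtypes(include="number").columns.tolist()

print("Categorical Variables:", cat_cols)
print("Numerical Variables:", num_cols)
```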
Year Skew : -0.84
Kilometers_Driven Skew : 61.58
Seats Skew : 1.9
Price Skew : 3.34
Car_Age Skew : 0.84
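`Year` and `Car_Age` have skews of equal size and opposite sign (-0.84 and 0.84) because `Car_Age` is a constant minus `Year`, which mirrors the distribution. The large positive skew of `Kilometers_Driven` (61.58) matches its maximum of 6,500,000 against a median of 53,416. A sketch of the computation, assuming `Series.skew()` was used and taking the five `Year` quantiles from the summary table as stand-in data:

```python
import pandas as pd

# Stand-in values: min, quartiles and max of Year from the summary table.
sample = pd.DataFrame({"Year": [1996, 2011, 2014, 2016, 2019]})
sample["Car_Age"] = 2025 - sample["Year"]  # reference year 2025 is an assumption

for col in ["Year", "Car_Age"]:
    # Series.skew() returns the sample skewness; the sign shows the direction of the long tail.
    print(col, "Skew :", round(sample[col].skew(), 2))
```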
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7253 entries, 0 to 7252
Data columns (total 18 columns):
 #   Column                 Non-Null Count  Dtype
---  ------                 --------------  -----
 0   Name                   7253 non-null   object
 1   Location               7253 non-null   object
 2   Year                   7253 non-null   int64
 3   Kilometers_Driven      7253 non-null   int64
 4   Fuel_Type              7253 non-null   object
 5   Transmission           7253 non-null   object
 6   Owner_Type             7253 non-null   object
 7   Mileage                7251 non-null   object
 8   Engine                 7207 non-null   object
 9   Power                  7207 non-null   object
 10  Seats                  7200 non-null   float64
 11  New_Price              1006 non-null   object
 12  Price                  6019 non-null   float64
 13  Car_Age                7253 non-null   int64
 14  Brand                  7253 non-null   object
 15  Model                  7252 non-null   object
 16  Kilometers_Driven_log  7253 non-null   float64
 17  Price_log              6019 non-null   float64
dtypes: float64(4), int64(3), object(11)
memory usage: 1020.1+ KB
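The listing gains `Kilometers_Driven_log` and `Price_log`, the two columns with the strongest right skew. Only the column names are visible, so the natural logarithm used below is an assumption. The sketch also shows why `Price_log` keeps the same 6019 non-null count as `Price`: the log of a missing value stays missing.

```python
import numpy as np
import pandas as pd

# Stand-in frame; one Price is missing on purpose.
sample = pd.DataFrame({
    "Kilometers_Driven": [72000, 41000, 46000],
    "Price": [1.75, 12.50, np.nan],
})

# Log transform to compress the long right tail (natural log is assumed).
sample["Kilometers_Driven_log"] = np.log(sample["Kilometers_Driven"])
sample["Price_log"] = np.log(sample["Price"])
print(sample)
```

`np.log` requires strictly positive inputs; both columns satisfy that here (minimum 171 km and 0.44 for price).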
[47]:
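# Note: sns.pairplot builds its own figure, so this plt.figure call only creates the
# empty figure reported in the output below; pairplot's `height` argument controls plot size.
plt.figure(figsize=(13,17))
sns.pairplot(data=data.drop(['Kilometers_Driven','Price'],axis=1))
plt.show()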
<Figure size 1300x1700 with 0 Axes>
[53]:
2
[57]:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~\anaconda3\Lib\site-packages\pandas\core\frame.py:12687, in _reindex_for_setitem(value, index)
  12686 try:
> 12687     reindexed_value = value.reindex(index)._values
  12688 except ValueError as err:
  12689     # raised in MultiIndex.from_tuples, see test_insert_error_msmgs

File ~\anaconda3\Lib\site-packages\pandas\core\series.py:5153, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
   5136 @doc(
   5137     NDFrame.reindex,  # type: ignore[has-type]
   5138     klass=_shared_doc_kwargs["klass"],
   (...)
   5151     tolerance=None,
   5152 ) -> Series:
-> 5153     return super().reindex(
   5154         index=index,
   5155         method=method,
   5156         copy=copy,
   5157         level=level,
   5158         fill_value=fill_value,
   5159         limit=limit,
   5160         tolerance=tolerance,
   5161     )

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:5610, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5609 # perform the reindex on the axes
-> 5610 return self._reindex_axes(
   5611     axes, level, limit, tolerance, method, fill_value, copy
   5612 ).__finalize__(self, method="reindex")

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:5633, in NDFrame._reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   5632 ax = self._get_axis(a)
-> 5633 new_index, indexer = ax.reindex(
   5634     labels, level=level, limit=limit, tolerance=tolerance, method=method
   5635 )
   5637 axis = self._get_axis_number(a)

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:4433, in Index.reindex(self, target, method, level, limit, tolerance)
   4431     indexer, _ = self.get_indexer_non_unique(target)
-> 4433 target = self._wrap_reindex_result(target, indexer, preserve_names)
   4434 return target, indexer

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\multi.py:2717, in MultiIndex._wrap_reindex_result(self, target, indexer, preserve_names)
   2716 try:
-> 2717     target = MultiIndex.from_tuples(target)
   2718 except TypeError:
   2719     # not all tuples, see test_constructor_dict_multiindex_reindex_flat

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\multi.py:222, in names_compat.<locals>.new_meth(self_or_cls, *args, **kwargs)
    220     kwargs["names"] = kwargs.pop("name")
--> 222 return meth(self_or_cls, *args, **kwargs)

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\multi.py:617, in MultiIndex.from_tuples(cls, tuples, sortorder, names)
    615     tuples = np.asarray(tuples._values)
--> 617     arrays = list(lib.tuples_to_object_array(tuples).T)
    618 elif isinstance(tuples, list):

File lib.pyx:3029, in pandas._libs.lib.tuples_to_object_array()

ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_6964\2680079267.py in ?()
      1 data.Seats.isnull().sum()
      2 data['Seats'].fillna(value=np.nan,inplace=True)
----> 3 data['Seats']=data.groupby(['Model','Brand'])['Seats'].apply(lambda x:x.fillna(x.median()))
      4 data['Engine']=data.groupby(['Brand','Model'])['Engine'].apply(lambda x:x.fillna(x.median()))
      5 data['Power']=data.groupby(['Brand','Model'])['Power'].apply(lambda x:x.fillna(x.median()))

~\anaconda3\Lib\site-packages\pandas\core\frame.py in ?(self, key, value)
   4307             # Column to set is duplicated
   4308             self._setitem_array([key], value)
   4309         else:
   4310             # set column
-> 4311             self._set_item(key, value)

~\anaconda3\Lib\site-packages\pandas\core\frame.py in ?(self, key, value)
   4520
   4521         Series/TimeSeries will be conformed to the DataFrames index to
   4522         ensure homogeneity.
   4523         """
-> 4524         value, refs = self._sanitize_column(value)
   4525
   4526         if (
   4527             key in self.columns

~\anaconda3\Lib\site-packages\pandas\core\frame.py in ?(self, value)
   5259         assert not isinstance(value, DataFrame)
   5260         if is_dict_like(value):
   5261             if not isinstance(value, Series):
   5262                 value = Series(value)
-> 5263             return _reindex_for_setitem(value, self.index)
   5264
   5265         if is_list_like(value):
   5266             com.require_length_match(value, self.index)

~\anaconda3\Lib\site-packages\pandas\core\frame.py in ?(value, index)
  12690         if not value.index.is_unique:
  12691             # duplicate axis
  12692             raise err
  12693
> 12694         raise TypeError(
  12695             "incompatible index of inserted column with frame index"
  12696         ) from err
  12697     return reindexed_value, None

TypeError: incompatible index of inserted column with frame index
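The `TypeError` most likely arises because, in recent pandas versions, `groupby(...).apply(...)` returns a Series whose index has the group keys prepended (Model, Brand, original row label). That MultiIndex cannot be aligned with the frame's plain `RangeIndex` on assignment. Two further issues sit in the same cell: `fillna(value=np.nan, inplace=True)` replaces NaN with NaN and so does nothing, and `Engine` and `Power` are still strings such as "998 CC", so a median cannot be computed for them until they are converted to numbers. The sketch below shows one possible fix on a stand-in frame, using `transform("median")`, which returns values aligned to the original index. It is a suggested alternative, not the notebook author's code.

```python
import numpy as np
import pandas as pd

# Stand-in frame with gaps; not the real dataset.
sample = pd.DataFrame({
    "Brand": ["Maruti", "Maruti", "Maruti", "Honda", "Honda"],
    "Model": ["WagonR", "WagonR", "WagonR", "JazzV", "JazzV"],
    "Seats": [5.0, np.nan, 5.0, 5.0, np.nan],
    "Engine": ["998 CC", "998 CC", np.nan, "1199 CC", "1199 CC"],
})

# Strip the unit so the column is numeric: "998 CC" -> 998.0.
sample["Engine"] = pd.to_numeric(sample["Engine"].str.split().str[0], errors="coerce")

for col in ["Seats", "Engine"]:
    # transform keeps the original row index, so the result aligns with the frame.
    group_median = sample.groupby(["Brand", "Model"])[col].transform("median")
    sample[col] = sample[col].fillna(group_median)

print(sample)
```

Groups in which every value is missing have no median, so some gaps can remain after this step and need a coarser fallback, such as a brand-level or overall median.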